 ci cd pipeline


Enhancing Cloud Security through Topic Modelling

Saleh, Sabbir M., Madhavji, Nazim, Steinbacher, John

arXiv.org Artificial Intelligence

Protecting cloud applications is critical in an era where security threats are increasingly sophisticated and persistent. Continuous Integration and Continuous Deployment (CI/CD) pipelines are particularly vulnerable, making innovative security approaches essential. This research explores the application of Natural Language Processing (NLP) techniques, specifically Topic Modelling, to analyse security-related text data and anticipate potential threats. We focus on Latent Dirichlet Allocation (LDA) and Probabilistic Latent Semantic Analysis (PLSA) to extract meaningful patterns from data sources, including logs, reports, and deployment traces. Using the Gensim framework in Python, these methods categorise log entries into security-relevant topics (e.g., phishing, encryption failures). The identified topics are leveraged to highlight patterns indicative of security issues across CI/CD's continuous stages (build, test, deploy). This approach introduces a semantic layer that supports early vulnerability recognition and contextual understanding of runtime behaviours.
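To make the idea of sorting log entries into topics concrete, here is a minimal sketch of LDA fitted with collapsed Gibbs sampling. It uses only the Python standard library and is a stand-in for the technique, not the authors' Gensim-based implementation. The sample log lines, the function names `lda_gibbs` and `top_words`, and the hyperparameter values are all assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical, pre-tokenised log entries (illustrative only, not from the paper).
LOGS = [
    "phishing email link credential reset".split(),
    "suspicious email link phishing user".split(),
    "tls handshake failed encryption key expired".split(),
    "encryption key rotation failed tls certificate".split(),
]

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (doc_topic, topic_word) counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic, per doc
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # tokens per topic overall
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # p(z = k) is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word

def top_words(topic_word, n=3):
    """Return the n most frequent words assigned to each topic."""
    return [[w for w, _ in counts.most_common(n)] for counts in topic_word]

if __name__ == "__main__":
    doc_topic, topic_word = lda_gibbs(LOGS, num_topics=2)
    print(top_words(topic_word))
    print(doc_topic)
```

With Gensim, the same steps would typically go through its dictionary, bag-of-words corpus, and LDA model classes. The hand-written sampler is used here only to keep the sketch dependency-free.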


AI-Augmented CI/CD Pipelines: From Code Commit to Production with Autonomous Decisions

Baqar, Mohammad, Naqvi, Saba, Khanda, Rajat

arXiv.org Artificial Intelligence

Modern software delivery has accelerated from quarterly releases to multiple deployments per day. While CI/CD tooling has matured, human decision points (interpreting flaky tests, choosing rollback strategies, tuning feature flags, and deciding when to promote a canary) remain major sources of latency and operational toil. We propose AI-Augmented CI/CD Pipelines, where large language models (LLMs) and autonomous agents act as policy-bounded co-pilots and progressively as decision makers. We contribute: (1) a reference architecture for embedding agentic decision points into CI/CD, (2) a decision taxonomy and policy-as-code guardrail pattern, (3) a trust-tier framework for staged autonomy, (4) an evaluation methodology using DevOps Research and Assessment (DORA) metrics and AI-specific indicators, and (5) a detailed industrial-style case study migrating a React 19 microservice to an AI-augmented pipeline. We discuss ethics, verification, auditability, and threats to validity, and chart a roadmap for verifiable autonomy in production delivery systems.
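One way to picture a policy-as-code guardrail combined with trust tiers is a small function that sits between the agent's proposal and the pipeline. The sketch below is purely illustrative: the tier names, thresholds, metric fields, and the `guard_promotion` function are assumptions, not the architecture described in the paper.

```python
from dataclasses import dataclass
from enum import IntEnum

class TrustTier(IntEnum):
    """Hypothetical staged-autonomy levels, lowest to highest."""
    SUGGEST = 0            # agent only recommends; a human decides
    ACT_WITH_APPROVAL = 1  # agent acts after explicit human sign-off
    AUTONOMOUS = 2         # agent may act on its own within policy

@dataclass(frozen=True)
class CanaryMetrics:
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float

@dataclass(frozen=True)
class Policy:
    # Assumed thresholds, chosen only for this example.
    max_error_rate: float = 0.01
    max_p95_latency_ms: float = 300.0
    min_tier_to_promote: TrustTier = TrustTier.AUTONOMOUS

def guard_promotion(agent_decision, tier, metrics, policy=Policy()):
    """Return the action the pipeline may actually take."""
    if agent_decision != "promote":
        # Conservative proposals such as "hold" pass through unchanged.
        return agent_decision
    if (metrics.error_rate > policy.max_error_rate
            or metrics.p95_latency_ms > policy.max_p95_latency_ms):
        # Hard policy limits override the agent regardless of its tier.
        return "hold"
    if tier < policy.min_tier_to_promote:
        # Metrics are healthy, but the agent has not earned this autonomy yet.
        return "needs_human_approval"
    return "promote"

if __name__ == "__main__":
    healthy = CanaryMetrics(error_rate=0.002, p95_latency_ms=180.0)
    print(guard_promotion("promote", TrustTier.SUGGEST, healthy))
    print(guard_promotion("promote", TrustTier.AUTONOMOUS, healthy))
```

Keeping the limits in a plain data object means the policy can be versioned and reviewed like any other code, which is the property the guardrail pattern relies on.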


The Impact of Software Testing with Quantum Optimization Meets Machine Learning

Bandarupalli, Gopichand

arXiv.org Artificial Intelligence

Modern software systems' complexity challenges efficient testing, as traditional machine learning (ML) struggles with large test suites. This research presents a hybrid framework integrating Quantum Annealing with ML to optimize test case prioritization in CI/CD pipelines. Leveraging quantum optimization, it achieves a 25% increase in defect detection efficiency and a 30% reduction in test execution time versus classical ML, validated on the Defects4J dataset. A simulated CI/CD environment demonstrates robustness across evolving codebases. Visualizations, including defect heatmaps and performance graphs, enhance interpretability. Software testing is integral to ensuring software quality, accounting for 40-50% of development resources in large-scale systems [1]. The rise of microservices, cloud-native architectures, and continuous integration/continuous deployment (CI/CD) practices has intensified the demand for rapid, reliable testing methods [2].
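For readers unfamiliar with test case prioritization, the following sketch shows a simple classical baseline: rank tests by predicted failure probability per second of runtime, then fill a time budget greedily. This is not the quantum annealing method from the paper. The `SuiteEntry` type, the `prioritize` function, and the probabilities and runtimes are hypothetical values standing in for the output of an ML model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SuiteEntry:
    name: str
    failure_probability: float  # assumed to come from some ML predictor
    runtime_seconds: float

# Hypothetical suite; the numbers are made up for illustration.
SUITE = [
    SuiteEntry("test_checkout", failure_probability=0.9, runtime_seconds=30.0),
    SuiteEntry("test_login", failure_probability=0.1, runtime_seconds=5.0),
    SuiteEntry("test_search", failure_probability=0.5, runtime_seconds=10.0),
]

def prioritize(tests, time_budget_seconds):
    """Greedily pick tests with the best failure probability per second."""
    ranked = sorted(
        tests,
        key=lambda t: t.failure_probability / t.runtime_seconds,
        reverse=True,
    )
    selected, used = [], 0.0
    for test in ranked:
        # Skip any test that would push the run past the time budget.
        if used + test.runtime_seconds <= time_budget_seconds:
            selected.append(test)
            used += test.runtime_seconds
    return selected

if __name__ == "__main__":
    for test in prioritize(SUITE, time_budget_seconds=40.0):
        print(test.name)
```

The greedy ratio rule is a heuristic for what is essentially a knapsack-style selection problem. Optimization approaches such as annealing target the same kind of objective and can search it more thoroughly.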


Advancing Software Security and Reliability in Cloud Platforms through AI-based Anomaly Detection

Saleh, Sabbir M., Sayem, Ibrahim Mohammed, Madhavji, Nazim, Steinbacher, John

arXiv.org Artificial Intelligence

Continuous Integration/Continuous Deployment (CI/CD) is fundamental for advanced software development, supporting faster and more efficient delivery of code changes into cloud environments. However, security issues in the CI/CD pipeline remain challenging, and incidents (e.g., DDoS, Bot, Log4j) continue to occur in cloud environments. While plenty of literature discusses static security testing and CI/CD practices, only a few studies deal with network traffic pattern analysis to detect different cyberattacks. This research aims to enhance CI/CD pipeline security by implementing anomaly detection with Artificial Intelligence (AI) support. The goal is to identify unusual behaviour or deviations from normal network traffic patterns in pipeline and cloud platforms. The system is intended to integrate into the workflow to continuously monitor pipeline activities and cloud infrastructure. Additionally, the research aims to explore adaptive response mechanisms to mitigate the detected anomalies or security threats. This research employed two popular network traffic datasets, CSE-CIC-IDS2018 and CSE-CIC-IDS2017. We implemented a combination of a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) to detect unusual traffic patterns. We achieved accuracies of 98.69% and 98.30% and generated log files in different CI/CD pipeline stages that resemble the network anomalies, with the aim of addressing security challenges in modern DevOps practices and contributing to advancing software security and reliability.


Automating the Training and Deployment of Models in MLOps by Integrating Systems with Machine Learning

Liang, Penghao, Song, Bo, Zhan, Xiaoan, Chen, Zhou, Yuan, Jiaqiang

arXiv.org Artificial Intelligence

This article introduces the importance of machine learning in real-world applications and explores the rise of MLOps (Machine Learning Operations) and its importance for solving challenges such as model deployment and performance monitoring. By reviewing the evolution of MLOps and its relationship to traditional software development methods, the paper proposes ways to integrate the system into machine learning to solve the problems faced by existing MLOps and improve productivity. The paper focuses on the importance of automated model training and on ensuring the transparency and repeatability of the training process through a version control system. In addition, the challenges of integrating machine learning components into traditional CI/CD pipelines are discussed, and solutions such as versioning environments and containerization are proposed. Finally, the paper emphasizes the importance of continuous monitoring and feedback loops after model deployment to maintain model performance and reliability. Using case studies and best practices from Netflix, the article presents key strategies and lessons learned for successful implementation of MLOps practices, providing valuable references for other organizations to build and optimize their own MLOps practices.


Deploy Flask app using docker and GitHub actions on Heroku - Dragon Forest

#artificialintelligence

After creating a machine learning model, we can use the Flask framework to create an API for web applications. Here I will teach you how to deploy a Flask app to Heroku using Docker and GitHub Actions. With Docker and GitHub Actions, you can create a CI/CD pipeline for your machine learning project. CI/CD means continuous integration and continuous deployment.


What is MLOps?

#artificialintelligence

Ever liked something on Instagram and then, almost immediately, had related content in your feed? Or searched for something on Google and then been spammed with ads for that exact thing moments later? These are symptoms of an increasingly automated world. Behind the scenes, they are the result of state-of-the-art MLOps pipelines. We take a look at MLOps and what it takes to deploy machine learning models effectively. We start by discussing some key aspects of DevOps.


Machine Learning at Scale with Databricks and Kubernetes

#artificialintelligence

Machine Learning Operationalisation (ML Ops) is a set of practices that aim to quickly and reliably build, deploy and monitor machine learning applications. Many organizations standardize around certain tools to develop a platform to enable these goals. One combination of tools includes using Databricks to build and manage machine learning models and Kubernetes to deploy models. This article will explore how to design this solution on Microsoft Azure followed by step-by-step instructions on how to implement this solution as a proof-of-concept. This approach aims to use common open source technologies and can easily be adapted for other cloud platforms.


MLOps for Conversational AI with Rasa, DVC, and CML (Part I)

#artificialintelligence

This is the first part of a series of blog posts that describe how to use Data Version Control (DVC) and Continuous Machine Learning (CML) when developing conversational AI assistants using the Rasa framework. This post is mostly an introduction to these three components; in the next post I'll delve into the code and how to get everything connected for Rasa MLOps bliss. If you've not heard of Data Version Control (DVC), you've been missing out. DVC is an exciting tool from iterative.ai. DVC extends git's functionality to cover your data wherever you want to store it, whether that is locally, on a cloud platform like AWS S3, or a Hadoop File System. Like git, DVC is language agnostic.


Build MLOps workflows with Amazon SageMaker projects, GitLab, and GitLab pipelines

#artificialintelligence

Machine learning operations (MLOps) is key to transitioning effectively from an experimentation phase to production. The practice gives you the ability to create a repeatable mechanism to build, train, deploy, and manage machine learning models. To quickly adopt MLOps, you often require capabilities that use your existing toolsets and expertise. Projects in Amazon SageMaker give organizations the ability to easily set up and standardize developer environments for data scientists and CI/CD (continuous integration, continuous delivery) systems for MLOps engineers. With SageMaker projects, MLOps engineers or organization administrators can define templates that bootstrap the ML workflow with source version control, automated ML pipelines, and a set of code to quickly start iterating over ML use cases.